Data Cleaning

Exploratory Data Analysis

Department Aisle Product

Reorder Ratio

Customer segmentation

Association rules

Feature Engineering

Total number of orders

Product ordered in last order

Average products per order

Product total counts

Product reorder ratio

Aisle reorder ratio

Department reorder ratio

The number of times the user bought a certain product

Average add_to_cart position

Average days since prior order

Is Organic or not

Department weight index

Compile all features together

Modeling

Sampling

Our target classes are imbalanced, with a ratio of about 9:1 (not-reorder to reorder), so we rebalance the training data before modeling. We first applied SMOTE to oversample the minority class (reorder), then undersampled the majority class (not-reorder).

After resampling, the data is much closer to balanced, with a class ratio of about 5:3.
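The two resampling steps can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration, not the project's actual code: the toy 2-D data, the helper names `smote` and `undersample`, and the target counts (900:100 resampled to 450:270, chosen only so the result matches the 9:1 and 5:3 ratios mentioned above). In practice a library such as imbalanced-learn, which provides `SMOTE` and `RandomUnderSampler`, would likely be used instead.

```python
import random

def smote(minority, n_new, k=5, rng=None):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (the core SMOTE idea)."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` inside the minority class, excluding itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic

def undersample(majority, n_keep, rng=None):
    """Randomly keep n_keep majority samples, without replacement."""
    rng = rng or random.Random(0)
    return rng.sample(majority, n_keep)

# Toy data (hypothetical): 900 not-reorder vs 100 reorder points, i.e. 9:1.
rng = random.Random(42)
majority = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(900)]
minority = [(rng.gauss(2, 1), rng.gauss(2, 1)) for _ in range(100)]

# Step 1: oversample the minority class with synthetic points (100 -> 270).
minority_bal = minority + smote(minority, n_new=170, rng=rng)
# Step 2: randomly undersample the majority class (900 -> 450).
majority_bal = undersample(majority, n_keep=450, rng=rng)

print(len(majority_bal), len(minority_bal))  # 450 270, a 5:3 ratio
```

Resampling like this is normally applied to the training split only, so that the evaluation data keeps the original class distribution.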

Logistic Regression

Gaussian Naive Bayes

Decision Tree

k-Nearest Neighbors (KNN)

Stochastic Gradient Descent

Random Forest

Gradient Boosting Classifier

Prediction